Skip to content

Allow immediately resolving cycles with a fallback instead of fixpoint #798

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Veykril
Copy link
Member

@Veykril Veykril commented Apr 16, 2025

In rust-analyzer we have queries that may cause cycles where the cycle itself is a an error that should result in some default. Immediately falling back in a fixpoint cycle is tricky to get right as the cycle tends to not converge and in fact, we do not need the single extra iteration.

This proposes adding a third cycle handling mode that populates an initial value to be used as the return value of the cycle hitting query.

Copy link

netlify bot commented Apr 16, 2025

Deploy Preview for salsa-rs ready!

Name Link
🔨 Latest commit 9570e80
🔍 Latest deploy log https://app.netlify.com/sites/salsa-rs/deploys/680377b8f0253a00084567e9
😎 Deploy Preview https://deploy-preview-798--salsa-rs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link

codspeed-hq bot commented Apr 16, 2025

CodSpeed Performance Report

Merging #798 will degrade performances by 4.9%

Comparing Veykril:veykril/push-wqwvkrpytswm (9570e80) with master (ab7ecb4)

Summary

❌ 1 regressions
✅ 11 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
amortized[SupertypeInput] 4 µs 4.2 µs -4.9%

@Veykril Veykril force-pushed the veykril/push-wqwvkrpytswm branch 2 times, most recently from 9a99895 to 0f8f412 Compare April 16, 2025 13:23
@ChayimFriedman2
Copy link
Contributor

Perhaps it is better to have a different name than cycle_initial, that implies something after the initial?

Copy link
Member

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in 2025-04-16 meeting, but let's write cycle_result = X.

@Veykril Veykril force-pushed the veykril/push-wqwvkrpytswm branch from 0f8f412 to 196f597 Compare April 16, 2025 15:14
@carljm
Copy link
Contributor

carljm commented Apr 16, 2025

Should we add documentation for this in the book?

I think one thing that should be documented there is that using cycle_result is likely to cause non-deterministic results for a given query graph depending on the entry point.

Avoiding this is the reason the fallback feature of fixpoint iteration requires that a single extra iteration must converge.

@ChayimFriedman2
Copy link
Contributor

I think one thing that should be documented there is that using cycle_result is likely to cause non-deterministic results for a given query graph depending on the entry point.

@Veykril can this cause problems for us? Definitely not for things that emit an error for cycles (e.g. layouts), since we'll just propagate the error, but what about stuff like generic_defaults()?

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Barring one error message, this looks reasonable to me, as long as we are OK with the potential non-determinism.

(To be clear, non-determinism is also possible with fixpoint iteration, if you define the semantics of your queries badly. But the convergence requirement means you have to work significantly harder to get non-determinism, whereas with cycle_result non-determinism is likely to be the default outcome.)

)),
(_, _, Some(_)) => Err(syn::Error::new_spanned(
self.args.cycle_initial.as_ref().unwrap(),
"must provide either `cycle_result` or `cycle_fn` along with `cycle_initial`, not both",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this error message doesn't seem right, you should not provide either cycle_initial or cycle_fn if you provide cycle_result

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also confused by this, until I realized this can be read as "provide either cycle_result or (cycle_fn along with cycle_initial)", but probably better to reword this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I did not realize the ambiguity in this statement

@davidbarsky
Copy link
Contributor

Barring one error message, this looks reasonable to me, as long as we are OK with the potential non-determinism.

(To be clear, non-determinism is also possible with fixpoint iteration, if you define the semantics of your queries badly. But the convergence requirement means you have to work significantly harder to get non-determinism, whereas with cycle_result non-determinism is likely to be the default outcome.)

The non-determinism comes from the lack of convergence by design, right?

@carljm
Copy link
Contributor

carljm commented Apr 16, 2025

The non-determinism comes from the lack of convergence by design, right?

I guess? Not entirely sure what you mean. But yes, non-determinism is an inevitable result of allowing the use of non-converging results in a cycle. So if you intend all of that, then it is all by design :)

@Veykril
Copy link
Member Author

Veykril commented Apr 17, 2025

Hm, is this really non-determinism? I mean from what I gather here, the result may potentially depend on the entry point, that is the order of execution. Ah, I guess that becomes non-deterministic when multiple threads are involved in the cycle?

@Veykril
Copy link
Member Author

Veykril commented Apr 17, 2025

Hmm, actually I think there is no non-determinism here (unlike with the immediate one-iteration fixpoint). I believe the order we enter a mutli-thread cycle does not matter, or rather, we can't enter a multi-thread cycle with this scheme as the cycle immediately resolves?

Though I haven't thought about what happens if we enter a fixpoint cycle that itself runs into a fallback cycle. That one might be problematic.

Either way I've added a parallel test based on cycle_a_t1_b_t2.rs for the fallback mechanism.

@Veykril Veykril force-pushed the veykril/push-wqwvkrpytswm branch from 196f597 to 7a2d3bf Compare April 17, 2025 09:59
@Veykril
Copy link
Member Author

Veykril commented Apr 17, 2025

Also of note is that this scheme still differs from what salsa used to:

  • Before, salsa replaced the result of the query that contained a cycle trigger with the fallback
  • With this, salsa just returns the result of the cycle invocation, so the query that contains the cycle trigger continues to run with this provisional value instead.

@Veykril
Copy link
Member Author

Veykril commented Apr 17, 2025

@carljm Can you double check whether my comments are correct with your understanding of the fixpoint logic?

I believe this to not introduce non-determinism (if my understanding of things is right), in which case I'd merge this without the need to document non-determinism.

@carljm
Copy link
Contributor

carljm commented Apr 17, 2025

Yes, the difference you describe between this PR and previous Salsa fallback behavior is because previous Salsa was going to some lengths to avoid non-determinism, by replacing the result of every query in the cycle with a fallback value.

I don't think the non-determinism here has any dependency on multi-threading, it's simply dependent on entry-point into the cycle, which can vary even given a single thread. Consider two queries a and b where a returns 10 * b() and b returns 1 + a(), and they both have a cycle_result of 0. If you happen to execute a first, you will memoize final results of 10 for a() and 1 for b. If you happen to execute b first, you will memoize final results of 0 for a and 1 for b. This is non-determinism. The value memoized for a is not deterministic based on the definitions of the queries and the values of inputs; it depends also on the order in which queries were executed.

In terms of how this occurs in real usage scenarios, I don't know as much about r-a, but for red-knot, much of our analysis is lazy, so if we have a third-party module we aren't checking directly, we will query just the particular definitions we need from it. So with this kind of non-determinism, the types we infer for things in that module could change, even if that module and its dependencies don't change, depending on what we happen to ask for from it first (which could depend on some details of the first-party module we are checking.)

I guess this behavior is still deterministic if you consider the entire execution context as part of the "inputs", but it is non-deterministic in the sense that I think Salsa has usually used the term, in which the results of any given query subgraph should depend only on those queries and their inputs.

Perhaps there are salsa users for whom this behavior is fine. I don't object to offering it as an option, just think it should be clearly documented, because in the past Salsa has gone to some lengths to avoid it.

@ChayimFriedman2
Copy link
Contributor

@Veykril Can't we do the same as old Salsa? That is, replacing the value of the cycle head with the fallback, and discarding every other query?

@Veykril
Copy link
Member Author

Veykril commented Apr 19, 2025

Right, I do consider the order of execution to make this deterministic (in a single threaded scenario), but you raised a good point, the order of queries that may depend on cycle set queries can give surprising results in some setups. I wonder if there is a some middle ground that can be struck here that would fulfill rust-analyzer's needs without incurring this problem.

Either way I agree in that case we should document this.

@Veykril
Copy link
Member Author

Veykril commented Apr 19, 2025

Hmm, now I do wonder though. This behavior specifically requires 2 or more cycle allowing queries. If the cycle only consists of one query type the result will still be deterministic, given there is only one such entry point. I believe that one alone is all rust-analyzer needs so I wonder if we can diagnose the other cases.

@Veykril Veykril force-pushed the veykril/push-wqwvkrpytswm branch from f0c7881 to d3e13b4 Compare April 19, 2025 08:39
@Veykril Veykril force-pushed the veykril/push-wqwvkrpytswm branch from d3e13b4 to 5b31c65 Compare April 19, 2025 09:03
@Veykril
Copy link
Member Author

Veykril commented Apr 19, 2025

Hmm, okay looking at the rust-analyzer failures from before that limitations won't work for us it seems ... Maybe we should remodel our relevant queries here to become fallible for cycles, so that we can converge to an error instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants